Abstract

The United States Coast Guard uses pooled time series analysis to develop a ship and
aviation fuel requirement forecasting model. Given the volatility of aviation fuel prices and the
USAF  dependency  on  foreign  oil,  alternative  fuel  sources  are  a  serious  consideration  and  require 
forecasting  models  when  conducting  comparison  studies.  This  research  uses  the  Coast  Guard’s 
methodology  to  develop  an  Air  Force  aviation  fuel  requirements  model  for  the  Air  Force  Cost 
Analysis Agency (AFCAA). By pooling 1,422 historical consumption time series data points,
two  regression  models  are  developed  that  predict  aviation  fuel  requirements  in  gallons.  The 
remaining  356  randomly  excluded  data  points  are  then  used  to  validate  the  two  regression 
models.  The  research  shows  that  100  percent  of  the  least  squares  estimated  gallons  consumed 
fell  within  a  95  percent  confidence  interval  for  the  single  and  the  sub  macro-level  models. 
However,  the  single  and  sub  macro-level  models  are  fundamentally  flawed  as  both  fail  the 
underlying  linear  regression  assumptions  of  normality,  constant  variance,  and  independence. 
Although the research produces two models that predict aviation fuel requirements well, the
application of either the single or sub macro-level model is discouraged without a proper
understanding of the underlying statistics provided.
I. Introduction


Problem  Statement 

The  Air  Force  Cost  Analysis  Agency  (AFCAA)  is  searching  for  a  macro-level  model  that 
will  forecast  the  United  States  Air  Force  (USAF)  aviation  fuel  requirement.  The  forecasting 
model  will  serve  two  purposes.  First,  the  model  provides  a  cross-check  or  potential  replacement 
to  the  current  technique  employed.  Second,  given  the  volatility  of  aviation  fuel  prices  and  the 
USAF  dependency  on  foreign  oil,  alternative  fuel  sources  are  a  serious  consideration  and  require 
forecasting  models  when  conducting  comparison  studies.  This  research  seeks  to  determine  if 
pooled  time  series  analysis  can  develop  a  macro-level  model  to  forecast  the  baseline  Air  Force 
aviation  fuel  requirement  for  alternative  fuel  source  comparison  studies. 

General  Issue 

Annually,  the  AFCAA  forecasts  the  aviation  fuel  requirement  in  gallons.  Price  factors 
are  developed  and  published  by  The  Office  of  the  Secretary  of  Defense  (OSD)  to  convert  gallons 
to dollars for budgeting purposes. Currently, the AFCAA uses a five-year historical average of
aviation  fuel  consumption  data  to  determine  the  Air  Force  requirement  by  mission  design  series 
(MDS)  and  major  command  (MAJCOM). 
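
The five-year averaging approach described above can be sketched in a few lines. This is only an illustration of the general idea, not the AFCAA's actual procedure: the record layout, the function name `five_year_average`, and every number below are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical consumption history: (MDS, MAJCOM, fiscal year, gallons in millions).
# All values are invented for illustration.
records = [
    ("B-1B", "ACC", 2002, 10.0),  # falls outside the five-year window for FY2008
    ("B-1B", "ACC", 2003, 50.0),
    ("B-1B", "ACC", 2004, 51.0),
    ("B-1B", "ACC", 2005, 52.0),
    ("B-1B", "ACC", 2006, 53.0),
    ("B-1B", "ACC", 2007, 54.0),
]


def five_year_average(records, forecast_year, window=5):
    """Average the `window` fiscal years before `forecast_year` per (MDS, MAJCOM)."""
    history = defaultdict(list)
    for mds, majcom, year, gallons in records:
        if forecast_year - window <= year < forecast_year:
            history[(mds, majcom)].append(gallons)
    return {key: mean(values) for key, values in history.items()}


baseline = five_year_average(records, forecast_year=2008)
print(baseline[("B-1B", "ACC")])
```

Each MDS and MAJCOM pair needs its own history, which is one reason the approach is data intensive.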

In  the  past,  AFCAA  has  investigated  potential  predictive  relationships  using  regression 
analysis. However, data sets at the MDS by MAJCOM level are small and rarely produce
consistent, statistically significant results. The lack of data is sometimes a deterrent to
regression analysis. The Coast Guard uses a technique that pools detailed data to a macro level,
increasing the size of the data set under analysis. By using the pooling technique, the effective
data set swells to over 1,700 data points. The data increase will provide a more robust analysis
to  determine  if  a  predictive  relationship  exists  with  statistically  significant  results. 
Research Approach and Scope

This research paper applies a quantitative methodology based on data pooling and multiple
regression techniques. Historical aviation fuel consumption data and a multitude of potential
explanatory  variables,  such  as  flying  hours,  sorties,  mission  type,  weapon  system  type,  and  major 
command  are  provided  by  the  AFCAA.  However,  the  AFCAA  database  only  captures  nine  data 
points  for  each  weapon  system  within  a  major  command.  Although  nine  data  points  is  sufficient 
to  conduct  regression  analysis,  this  research  seeks  to  determine  relationships  across  multiple 
weapon  systems  and  major  commands.  By  pooling  or  combining  all  major  commands  and 
weapon  system  data,  the  potential  exists  to  develop  one  macro  aviation  fuel  requirement  model 
based  on  known  explanatory  requirements  using  multiple  regression  analysis. 

Several  techniques  to  develop  a  macro-level  forecasting  model  for  aviation  fuel 
requirements  are  available.  In  Chapter  Two,  some  of  these  techniques  are  discussed.  However, 
this  research  narrows  the  scope  to  the  technique  of  pooled  time  series  analysis  in  an  effort  to 
determine  the  potential  application  to  predicting  aviation  fuel  requirements. 

Research  Benefits 

The  research  seeks  to  develop  a  macro-level  forecasting  model  for  USAF  aviation  fuel 
requirements.  The  model  will  either  replace  or  provide  a  cross-check  to  the  existing  method  that 
the  AFCAA  uses  to  predict  aviation  fuel  requirements.  The  model  will  also  serve  to  conduct 
alternative  fuel  comparison  studies  as  the  need  to  reduce  foreign  oil  dependency  increases. 

Chapter  Summary 

This chapter proposes pooled time series analysis as a technique to develop a macro-level
model to forecast USAF aviation fuel requirements. Chapter Two explores the United
States and USAF foreign oil dependency and vulnerability, alternative fuel sources, and the
Coast Guard’s pooled time series analysis model for forecasting ship and aviation fuel
requirements.  Chapter  Three  explains  the  methodology  used  to  develop  and  test  a  potential  Air 
Force aviation fuel requirement model using pooled time series analysis. Chapter Four presents
the  results  of  the  model  development.  Chapter  Five  concludes  with  model  recommendations. 
II. Literature Review


Chapter  Overview 

The dependency on foreign oil places the United States and USAF in a vulnerable
position as world competition increases for a finite global resource. For this reason, the United
States is searching for alternative sources to fuel the economy and its military machine. To
better understand the requirements of aviation fuel and potential alternative fuel source
comparison studies, this research investigates existing methods or techniques to forecast
aviation fuel requirements. Finally, the Coast Guard’s pooled time series analysis ship and
aviation  forecasting  model  is  examined  for  applicability  to  forecasting  Air  Force  aviation  fuel 
requirements. 

The  United  States  and  USAF  Dependency  and  Vulnerability  on  Foreign  Oil 

The  United  States  far  outpaces  the  world  in  oil  consumption,  consuming  over  25  percent 
(7.6 billion barrels per year) of the world’s 30 billion barrels of oil annually. Without oil,
America’s economy and military machine would come to a screeching halt. America imports
roughly 63 percent of its oil. Foreign dependency on a high-demand finite resource jeopardizes
U.S.  national  security. 

In his 1994 book, The Road to 2015, John Peterson predicted United States dependence
on Middle East foreign oil. Table 1 summarizes the top sources of United States oil imports. Of
particular concern are the Organization of the Petroleum Exporting Countries (OPEC) nations,
which account for almost 50% of the oil imports. Since 1989, United States oil imports have
steadily increased, a favorable trend for the OPEC nations. The majority of “world oil is in the
Middle East, controlled by OPEC, a cartel of unfriendly, unstable regimes that already exercise
too much control over the world oil prices.” Reliance on such a vital resource leaves national
security at the mercy of OPEC, which provides a staggering 30% of the United States’ overall
demand for oil. However, the greater threat is dependence on a finite resource.


Table 1: 2008 Top Importers from January through August

     Top Importers      Import Percent   Cumulative % Imports   Barrels in Thousands
  1  Canada                  18.6%              18.6%                  592,199
  2  Saudi Arabia*           12.0%              30.6%                  380,632
  3  Mexico                  10.1%              40.7%                  320,789
  4  Venezuela*               9.3%              50.0%                  295,205
  5  Nigeria*                 8.2%              58.1%                  260,287
  6  Iraq*                    5.2%              63.3%                  164,767
  7  Algeria*                 4.0%              67.4%                  127,981
  8  Angola*                  4.0%              71.4%                  127,651
  9  Russia                   3.7%              75.1%                  118,767
 10  Virgin Islands           2.5%              77.6%                   79,491
 11  Brazil                   1.9%              79.5%                   59,620
 12  United Kingdom           1.7%              81.2%                   53,312
 13  Ecuador*                 1.7%              82.8%                   52,675
 14  Colombia                 1.6%              84.4%                   51,190
 15  Kuwait*                  1.6%              86.0%                   50,546
     All others**            14.0%             100.0%                  444,843
     Totals                 100.0%                                   3,179,955

*OPEC Nations
**Non-OPEC importers excluding Libya, Indonesia, and Arab Emirates

     Source of Imports   Distribution Percent   Barrels in Thousands
     OPEC                        47%                 1,494,364
     Non-OPEC                    53%                 1,685,591
     Total                      100%                 3,179,955


The amount of oil remaining in the world is still debated. Although there is no definitive
answer to “proven” and “unproven” reserves or “peak” production timelines, most agree that oil
is a finite resource with increasing global demand. The oil industry currently discovers less
than 40 percent of the new oil necessary to prevent the base reserves from shrinking. In his book,
The End of Oil, Paul Roberts predicts that the world will experience a peak in oil production in
the year 2016 based upon current trends in global consumption and an estimated trillion barrels
of remaining oil. The importance of a peak is that production declines drastically afterward.
However, the greater concern is that non-OPEC oil is likely to peak before OPEC oil, bringing
the world supply under “the control of a cartel with a history of rash behavior and dubious
sympathy for the West.” This is a monumental concern for U.S. national security given the
world’s ever-increasing demand for oil.

China,  a  distant  second  to  the  United  States,  accounts  for  only  7.9  percent  of  the  world’s 
consumption  or  less  than  a  third  (2.4  billion  barrels  per  year)  of  the  amount  consumed  by  the 
United  States.  With  a  population  roughly  four  and  a  half  times  larger  than  the  United  States 
and  an  accelerated  rate  of  industrialization,  China’s  demand  for  oil  is  projected  to  reach  5.8 
billion barrels per year by 2030. Other rapidly industrializing nations, like India, are
experiencing  similar  growth  demands  for  oil.  As  the  global  demand  for  oil  increases,  the  rate  of 
exhausting  reserves  accelerates. 

The Department of Defense (DOD), and the USAF in particular, is highly dependent on
oil. Aviation fuel is a large portion of the Air Force Flying Hour Program funding requirement.
In fiscal year 2007, the Air Force consumed over two and a half billion gallons while flying over
two million hours. Per capita, only the Virgin Islands and the Netherlands Antilles consume
more oil than the USAF. This scale of consumption requires statistically significant estimates
of future aviation fuel requirements.

The USAF has investigated the development of renewable energy sources such as
bio-fuels as an alternative to the non-renewable hydrocarbon sources. Some of the bio-fuels
considered as potential alternatives include ethanol, biodiesel, algae, and biobutanol. However,
an analysis of alternative fuel sources is not the purpose of this study. This research
investigates potential models to forecast the aviation fuel baseline requirement by using multiple
regression techniques. In an effort to reduce foreign oil dependency, a predictive model will
help the Air Force better understand the baseline requirement necessary for effective alternative
fuel source comparison studies.

United  States  Energy  Independence  through  Alternative  Fuel  Sources 

It is highly unlikely that the United States can become energy independent and still meet
increasing demand without alternative energy sources. The United States
ranks third in oil production with 21.4 billion barrels of proven reserves. However,
reserves would be exhausted in four to five years if the United States relied solely on
indigenous resources. In the wake of constrained budgets, volatile fuel prices, and increased
oil dependency on adversarial regimes, the United States must look to alternative fuel sources.
Although bio-fuels such as ethanol, biodiesel, algae, and biobutanol are not new alternatives,
challenges with production capacity, transportability, stability, and engine fuel compatibility
have so far prevented them from offering a cost benefit over hydrocarbons. However, once
alternative fuel technologies satisfy USAF criteria, comparing alternative fuel sources to the
hydrocarbon baseline will require aviation fuel forecasting models.

Issues  when  Forecasting  the  USAF  Aviation  Fuel  Requirement 

The AFCAA determines the annual aviation fuel requirement. The current method takes
an average of the past five fiscal years at the major command weapon system code level. The
process is time consuming and very data intensive. The purpose of this research is to develop a
macro-level mathematical relationship that forecasts aviation fuel requirements at the total Air
Force level. This research employs the Coast Guard’s application of pooled time series analysis
to determine if known explanatory variables will establish a macro-level relationship to predict
aviation fuel requirements.

The AFCAA conducted similar research in February of 1998 by developing a fuel
consumption cost estimating relationship based on explanatory variables such as weight, speed,
engine type, mission, and others. Due to the scope and purpose of this research, a full literature
review is not included. However, using similar explanatory variables and applying a pooled time
series methodology is worthy of future research.

Coast  Guard  Model  Using  Pooled  Time  Series  Analysis 

In August of 1999, the United States Coast Guard (USCG) Headquarters contracted the
Logistics Management Institute (LMI) to develop models that forecast aircraft and ship fuel
requirements. The models that LMI developed used a data pooling technique with linear
regression known as pooled time series analysis. The fundamental components of a linear
regression model are the intercept, slope, and the independent explanatory variable that explains
the dependent variable. This research, like the USCG study, investigates independent variables
like flying hours to explain fuel consumption, the dependent variable. When data observations of
the independent and dependent variables occur over time or across different groups, like weapon
system codes, pooling the time series data is a common model building technique. The
advantage of pooling time series data is that it increases the number of observable data points,
producing more powerful estimates. The power comes in the model’s ability to accurately
estimate fuel consumption requirements across a number of different platforms.
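
As a minimal sketch of the pooling idea, the example below stacks several short series into one data set and fits a single intercept and slope by ordinary least squares. It is not the LMI model: the series, the helper `fit_line`, and all values are hypothetical, and the invented points lie exactly on one line so the result is easy to check by hand.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical series keyed by (weapon system, MAJCOM):
# flying hours in thousands, gallons in millions. Values are invented.
series = {
    ("B-1B", "ACC"): ([10, 12, 14], [31, 37, 43]),
    ("B-52H", "ACC"): ([20, 22, 24], [61, 67, 73]),
    ("B-2A", "ACC"): ([5, 6, 7], [16, 19, 22]),
}

# Pooling: stack every series into one set of observations.
pooled_hours = [h for hours, _ in series.values() for h in hours]
pooled_gallons = [g for _, gallons in series.values() for g in gallons]

intercept, slope = fit_line(pooled_hours, pooled_gallons)
print(len(pooled_hours), round(intercept, 6), round(slope, 6))
```

Each individual series here has only three observations, while the pooled fit uses all nine, which is the gain in sample size the paragraph above describes.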

The  three  models  LMI  developed  produced  significant  explanatory  capability.  The 
aircraft  model  explains  99  percent  of  the  variation  of  fuel  consumption  using  flying  hours  as  the 
sole  predictor.  The  medium-  and  high-endurance  cutter  model  and  below-medium-endurance 
cutter  model  explained  87  and  90  percent  of  the  variation  of  fuel  consumption  respectively  using 
vessel hour operations as the sole predictor. Although the LMI study declares the three models
statistically significant, stating that the parameter estimates pass the t-tests, the research paper
does not include the test results for the assumptions of linear regression. This research applies
the same pooled time series analysis employed by LMI to develop a forecasting model for aviation
fuel requirements for the USAF.

Chapter  Summary 

This chapter outlines the necessity for a macro-level model to better predict fuel
requirements and support alternative fuel source comparisons. Based on past research, the Coast
Guard provides a methodology to develop a model to forecast aviation fuel requirements using
pooled time series analysis. The pooling technique increases the number of data points. The
larger data set creates a more robust regression analysis to better understand the predictive power
of potential models.
III. Methodology


Chapter  Review 

The Coast Guard uses pooled time series analysis to develop mathematical relationships
that forecast aviation and ship fuel requirements. This research seeks to develop and employ
similar  mathematical  relationships  by  applying  the  pooled  time  series  analysis  to  forecast  Air 
Force  aviation  fuel  requirements.  The  chapter  explains  the  data  preparation  and  pooling 
technique,  introduces  the  potential  explanatory  variables  used  in  the  regression  analysis,  and 
describes  the  theoretical  tests  necessary  to  claim  a  statistically  significant  model.  The  chapter 
continues  by  explaining  the  methodology  used  to  validate  the  predictive  capability  of  a 
theoretically  sound  model.  Finally,  the  method  to  assess  the  risk  and  uncertainty  of  model 
predictions  is  explained. 

Preparing  and  Pooling  the  Data 

The data used for the regression analysis are provided by the AFCAA. The
historical data include nine fiscal years of both numerical and categorical variables that are
delineated by pre- and post-9/11. Aviation fuel consumption in gallons, flying hours, and
sorties are the three numerical variables, and the categorical variables include MAJCOM,
weapon system code, weapon system type, and mission type. The three numerical variables are
further delineated into combat or training fuel consumption, flying hours, and sorties. The
complete data set provides 2,404 data points for analysis. However, some of the data points are
justifiably removed due to recording error or incompleteness.

The final dataset contains 1,778 data points after removing records that lacked
flying hours or gallons consumed, or for which the mission was unknown. Several hundred of the
data points had no recorded primary assigned aircraft (PAA) and five did not have the number of
sorties recorded. However, because the PAA and sorties were not significant predictors, those
data points remained in the final dataset. Upon completion of the dataset preparation, pooled
time series analysis is employed.
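
The record-screening rule described above can be sketched as a simple filter. The field names, the `is_usable` helper, and the sample records are assumptions made for this illustration; the actual AFCAA database layout is not described in the text.

```python
# Hypothetical records; field names and values are invented for illustration.
records = [
    {"mds": "B-1B", "flying_hours": 12.0, "gallons": 37.0, "mission": "Bomber", "paa": 18, "sorties": 900},
    {"mds": "F-15C", "flying_hours": None, "gallons": 5.0, "mission": "Fighter", "paa": 24, "sorties": 400},
    {"mds": "C-130H", "flying_hours": 8.0, "gallons": None, "mission": "Airlift", "paa": 12, "sorties": 300},
    {"mds": "T-1A", "flying_hours": 30.0, "gallons": 9.0, "mission": "Unknown", "paa": 40, "sorties": 700},
    {"mds": "B-2A", "flying_hours": 6.0, "gallons": 19.0, "mission": "Bomber", "paa": None, "sorties": None},
]


def is_usable(record):
    """Keep a record only if flying hours, gallons, and mission are all known."""
    return (
        record["flying_hours"] is not None
        and record["gallons"] is not None
        and record["mission"] not in (None, "Unknown")
    )


# Missing PAA or sorties does not disqualify a record, mirroring the text above.
clean = [r for r in records if is_usable(r)]
print([r["mds"] for r in clean])
```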

The  term  “pooling  time  series  analysis”  refers  to  the  data  arrangement  and  the  analysis 
technique.  First,  “pooling”  is  the  process  of  combining  similar  data  into  one  dataset  to  increase 
the number of observations when conducting the analysis. Currently, the AFCAA takes a
historical average of the aviation fuel consumed by a particular weapon system code within a
particular MAJCOM. However, nine data points are not ideal when using regression analysis to
determine statistically significant mathematical relationships. Although many of the weapon
systems at the MAJCOM level show a strong relationship between gallons consumed and flying
hours, this research seeks to discover a macro-level model to avoid the time-consuming process
of developing a predictive model for each weapon system code within a MAJCOM.
Figures 1 and 2 illustrate the pooling data technique using the bomber weapon systems.
Figure 1 shows a strong relationship between gallons consumed and flying hours for the B-1B in
Air Combat Command (ACC). Figure 2 shows the method of pooling by combining all of the
bomber weapon systems across all of the MAJCOMs. The large data points are the B-1Bs in
ACC and the remaining data points represent the B-2As, B-52s, and the remaining B-1Bs from
other  applicable  MAJCOMs.  The  benefit  of  pooling  the  data  is  the  creation  of  one  macro-level 
model  that  forecasts  the  aviation  fuel  requirement  for  bombers  given  the  programmed  flying 
hours  for  any  given  fiscal  year.  The  increase  in  data  points  enhances  the  fidelity  of  the  statistical 
significance  and  the  potential  predictive  power  of  the  mathematical  relationship.  The  purpose  of 
this  research  is  to  pool  the  time  series  data  to  develop  a  macro-level  model  that  is  statistically 
significant  to  justify  the  use  of  the  model  to  forecast  aviation  fuel  requirements. 


Multiple  Regression  Analysis 

Multiple regression is used to determine if there is a mathematical relationship between
gallons of aviation fuel consumed and possible predictors that are known prior to forecasting a
new aviation fuel requirement. To build the fuel requirement regression model, five categorical
(nominal) and three numerical (continuous) predictor variables are tested for significant
relationships. The five categorical predictors are MAJCOM, weapon system code, weapon
system type, mission type, and pre- or post-9/11 data (see Appendix A, Table 7). The three
numerical predictors are the flying hours by MAJCOM and weapon system code, the number of
sorties flown by MAJCOM and weapon system code, and the primary assigned aircraft (PAA), or
number of a particular weapon system code within a MAJCOM. Using JMP Statistical Analysis
Software, mathematical relationships are investigated and tested for theoretical soundness to
forecast aviation fuel requirements.

Statistical Significance Tests

To  test  for  the  statistical  reliability  of  potential  regression  models  an  analysis  is 
conducted  to  determine  if  any  influential  data  points  exist  that  bias  selected  explanatory  variables 
and  to  test  the  model  assumptions  for  normality,  constant  variance,  and  independence.  The  test 
for  possible  influential  data  points  is  achieved  by  plotting  Cook’s  D  influence  statistic  which 
indicates  observations  with  large  effects  on  parameter  estimates.  The  x-axis  labeled  “Rows”  is 
the  number  of  data  points  delineated  by  MAJCOM  and  weapon  system  code. 
Figure 3: Cook’s D Influential Data Points Test Example

When  values  are  greater  than  0.5  the  observation  is  considered  influential.  Figure  3  displays  an 
ideal  example  of  Cook’s  D  influential  statistic  plotted  showing  no  outlying  data  points. 
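
For readers who want to see how the influence statistic is computed, the following is a minimal sketch of Cook's D for a one-predictor least squares fit, using the 0.5 cutoff stated above. The data are invented, and the study itself relies on JMP's plot for a multiple regression rather than on code like this.

```python
from statistics import mean


def cooks_distance(x, y):
    """Cook's D for each observation of a simple (one-predictor) OLS fit."""
    n, p = len(x), 2  # p counts the fitted parameters: intercept and slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Invented data: the last point sits far from the others and off the trend.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
gallons = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1, 20.0]

distances = cooks_distance(hours, gallons)
influential = [i for i, d in enumerate(distances) if d > 0.5]
print(influential)
```

In this invented data set only the final observation combines high leverage with a large residual, so it is the only one that exceeds the cutoff.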


Figure 4: Residual Normality Test Example

Testing for normally distributed residuals ensures the validity of the
overall F-test and t-tests. The F-test indicates whether the overall model is significant. When
there are several explanatory variables, the t-tests indicate the significance of each individual
predictor. Additionally, inferences concerning the variability of model parameters hinge upon
normally distributed residuals. To determine if the regression model residuals are normally
distributed, a goodness-of-fit (GOF) test is conducted. The residuals are considered normally
distributed when the p-value is greater than 0.05. Figure 4 graphically depicts an example of
normally distributed residuals with a p-value greater than 0.05.
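
The decision rule above (treat residuals as normal when the p-value exceeds 0.05) can be illustrated with any normality test. The text does not name the specific goodness-of-fit test that JMP applied, so the sketch below swaps in the Jarque-Bera test because it needs only the standard library. Its p-value relies on a large-sample chi-square approximation, and both sets of residuals are synthetic.

```python
import math
from statistics import NormalDist, mean


def jarque_bera(residuals):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 df)."""
    n = len(residuals)
    centre = mean(residuals)
    m2 = sum((r - centre) ** 2 for r in residuals) / n
    m3 = sum((r - centre) ** 3 for r in residuals) / n
    m4 = sum((r - centre) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # The survival function of a chi-square with 2 degrees of freedom is exp(-x / 2).
    return statistic, math.exp(-statistic / 2)


n = 200
# Synthetic residuals: evenly spaced quantiles of a normal and of a skewed distribution.
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [-math.log(1 - (i + 0.5) / n) for i in range(n)]

_, p_normal = jarque_bera(normal_like)
_, p_skewed = jarque_bera(skewed)
print(p_normal > 0.05, p_skewed > 0.05)
```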

The test for constant variance and independence is performed graphically by using a
scatter plot of the predicted values versus the residual values. When constant variance and
independence are present, the residuals are evenly distributed around the zero line, as depicted
in Figure 5. When constant variance is not present, the fidelity of the predicted values is
compromised. Transforming the dependent variable is a potential correction for non-constant
variance.


Figure 5: Example of Constant Variance in the Residual by Predicted Plot

Model  Validation 

To validate the robustness of the regression models, a random sample of 20 percent of the
data is excluded from the model development. Once the model is developed and determined
statistically significant, the excluded 20 percent of the data is used to test the
predictive capability of the model. The regression model determines which explanatory variables
are significant. The same explanatory variables from the excluded data are entered into the
regression model to produce predictions. A successful validation is achieved if the model
predictions fall within a 95 percent prediction interval. The validation will demonstrate that
the regression model does not over-fit the data used to build it.
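
The hold-out procedure above can be sketched as follows. This is one plausible reading of the check, in which each excluded observation is compared against the 95 percent prediction interval from a model fit on the retained data. Several simplifications are assumptions of the sketch: a single predictor instead of the study's multiple regression, a normal quantile standing in for the t quantile, and synthetic data generated from an arbitrary seed.

```python
import random
from statistics import NormalDist, mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


def prediction_interval(x_train, y_train, x_new, level=0.95):
    """Approximate prediction interval for one new observation at x_new."""
    n = len(x_train)
    intercept, slope = fit_line(x_train, y_train)
    x_bar = mean(x_train)
    sxx = sum((xi - x_bar) ** 2 for xi in x_train)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x_train, y_train))
    s = (sse / (n - 2)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)  # large-sample stand-in for the t quantile
    centre = intercept + slope * x_new
    half_width = z * s * (1 + 1 / n + (x_new - x_bar) ** 2 / sxx) ** 0.5
    return centre - half_width, centre + half_width


# Synthetic data: flying hours and gallons with random noise (values are invented).
rng = random.Random(2009)
hours = [float(i) for i in range(1, 51)]
gallons = [2.0 + 3.0 * h + rng.gauss(0.0, 1.0) for h in hours]

# Randomly exclude 20 percent of the observations from model development.
holdout = set(rng.sample(range(len(hours)), k=len(hours) // 5))
x_train = [h for i, h in enumerate(hours) if i not in holdout]
y_train = [g for i, g in enumerate(gallons) if i not in holdout]

inside = 0
for i in sorted(holdout):
    low, high = prediction_interval(x_train, y_train, hours[i])
    inside += low <= gallons[i] <= high
coverage = inside / len(holdout)
print(f"share of excluded points inside the interval: {coverage:.2f}")
```

A coverage share near the nominal level suggests the model generalizes to data it was not fit on; a much lower share would point to over-fitting.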

Developing  Uncertainty  and  Risk  Analysis 

The  purpose  of  this  research  is  to  develop  a  regression  model  that  predicts  the  aviation 
fuel  requirement.  However,  the  usefulness  of  predictions  often  hinges  on  the  understanding  of 
the uncertainty and risk of a model’s output. Assuming the regression model passes the tests of
normality and constant variance, the mean and standard deviation of the prediction are the basis
for understanding the uncertainty and risk. In this case, uncertainty is the range of potential
outcomes  across  a  normal  probability  distribution  for  any  one  observation  defined  by  the 
model’s  mean  and  standard  deviation.  The  distribution  of  uncertainty  helps  a  decision  maker 
better  understand  the  probability  of  potential  risks  or  the  probability  of  an  unfavorable  outcome. 
Any  given  prediction  of  a  linear  regression  model  is  the  mean  and  has  an  associated  standard 
deviation.  Using  Monte  Carlo  simulation,  the  model  mean  prediction  and  standard  deviation 
produce  a  theoretical  normal  distribution  that  will  quantify  the  uncertainty  and  risk  of  forecasting 
aviation  fuel  requirements. 
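
A minimal sketch of the Monte Carlo step described above: draws from a normal distribution defined by a prediction's mean and standard deviation are used to estimate the probability of an unfavorable outcome. The predicted mean, standard deviation, and budget threshold are invented, and defining risk as "consumption exceeds the budgeted gallons" is an assumption made for this example.

```python
import random
from statistics import NormalDist

# Hypothetical inputs, in millions of gallons. None of these come from the study.
predicted_mean = 12.0
predicted_sd = 1.5
budgeted_gallons = 14.0

rng = random.Random(42)
draws = [rng.gauss(predicted_mean, predicted_sd) for _ in range(100_000)]

# Risk, as assumed here: the share of simulated outcomes above the budgeted amount.
simulated_risk = sum(d > budgeted_gallons for d in draws) / len(draws)

# For a single normal distribution the same probability is available in closed form.
analytic_risk = 1 - NormalDist(predicted_mean, predicted_sd).cdf(budgeted_gallons)
print(round(simulated_risk, 3), round(analytic_risk, 3))
```

For one normally distributed prediction the simulation only confirms the closed-form answer; simulation becomes more useful when several uncertain predictions have to be combined.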

Chapter  Summary 

This  chapter  explains  the  proposed  methodology  to  predict  aviation  fuel  requirements.  A 
discussion  of  preparing  and  pooling  the  data  provides  the  background  to  understanding  the 
nature  of  the  dataset  that  is  used  for  regression  analysis.  The  tests  for  statistical  significance  are 
explained  to  ensure  the  fundamental  assumptions  of  linear  regression  are  met.  The  method  of 
validating the forecasting model is presented. Finally, an explanation of the process to assess the
uncertainty  and  risk  of  the  model  predictions  is  provided. 
IV. Results


Chapter  Overview 

Chapter  Three  outlined  the  methodology  to  predict  aviation  fuel  requirements.  This 
chapter presents the results of applying pooled time series analysis to develop a forecasting
model for aviation fuel requirements. First, the aviation fuel regression models are displayed and
explained. Second, the statistical significance tests are presented. Finally, the model validation
results  are  discussed. 


Single  Macro-Level  Aviation  Fuel  Regression  Model 

The regression analysis examined several potential mathematical relationships, divided into
single and sub macro-level models. The single macro-level model uses 80 percent of the data to
develop a mathematical relationship, and the remaining 20 percent is set aside for model
validation.  This  research  develops  both  single  and  sub  macro-level  models  for  comparison 
purposes.  Figure  6  shows  a  scatter  plot  of  the  data  used  to  develop  a  mathematical  relationship 
for  the  single  macro-level  model  in  terms  of  gallons  consumed  and  flying  hours. 
The scatter plot shows that there is no linear relationship between gallons and flying
hours  alone.  However,  Figure  6  does  clearly  show  that  there  are  definite  sub  groups  that 
graphically  indicate  linear  relationships  which  are  discussed  later  in  the  chapter.  The  purpose  of 
this  section  of  the  study  is  to  develop  a  single  macro-level  model  by  introducing  additional 
predictor  variables  that  will  explain  actual  fuel  consumption  data.  Figure  7  graphically  illustrates 
the  least  squares  estimated  gallons  in  millions  by  the  model’s  predicted  gallons  in  millions.  The 
least  squares  estimated  gallons  are  arranged  closely  along  the  model’s  predicted  gallons 
regression  line,  demonstrating  that  our  regression  model  predicts  gallons  well.  However,  a 
closer  look  at  the  table  of  statistics  reveals  concerns  about  the  soundness  of  the  model. 


Figure 7: Initial Least Squares Estimated Gallons by Predicted Gallons Plot

First,  the  focus  of  the  table  of  statistics  is  the  adjusted  R  ,  F-test,  t-tests,  and  variance 
inflation  factor  (VIF).  The  adjusted  R  measures  the  model’s  capability  to  predict  the  gallons. 
The  adjusted  R  is  preferred  over  the  R  because  it  compares  across  models  with  different 
numbers  of  parameters  by  using  the  degrees  of  freedom  in  its  computation.'*^  The  F-test 
determines  the  overall  model  significance  and  the  t-test  determines  the  significance  of  each 
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explanatory  variable.  When  the />-values  are  less  than  0.05  the  model  or  individual  predictors  are 
considered  significant.  The  VIF  is  a  statistical  measurement  that  tests  for  multicollinearity,  or 
correlation  between  predictor  variables."^^  When  the  VIF  is  greater  than  or  equal  to  10, 
multicollinearity  may  exist  and  could  decrease  the  fidelity  of  any  given  point  prediction."^^ 
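
As an illustration of how these summary statistics relate to one another, the following sketch recomputes the R², adjusted R², and F ratio from the sums of squares and degrees of freedom reported in Table 2. The function name and argument names are chosen here for illustration only; the formulas are the standard ones for a least squares fit with an intercept and are not taken from the statistical software used in this research.

```python
def summary_stats(sse, sst, n, p):
    """Return (R^2, adjusted R^2, F ratio) for a least squares fit with an intercept.

    sse: error sum of squares
    sst: corrected total sum of squares
    n:   number of observations
    p:   number of predictors, excluding the intercept
    """
    df_error = n - p - 1
    df_total = n - 1
    r2 = 1.0 - sse / sst
    # Adjusted R^2 penalizes additional parameters through the degrees of freedom.
    adj_r2 = 1.0 - (sse / df_error) / (sst / df_total)
    # F ratio = model mean square / error mean square.
    f_ratio = ((sst - sse) / p) / (sse / df_error)
    return r2, adj_r2, f_ratio


# Values reported in Table 2 (initial single macro-level model).
r2, adj_r2, f_ratio = summary_stats(sse=155420.5, sst=1453118.6, n=1422, p=7)
print(round(r2, 5), round(adj_r2, 5), round(f_ratio, 1))
```

The recomputed values agree with the reported 0.89304, 0.89251, and 1686.618 to within rounding of the published sums of squares.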


Table 2: Initial Single Macro-Level Table of Statistics

Single Macro-Level Summary of Fit

RSquare                       0.89304
RSquare Adj                   0.89251
Root Mean Square Error        10.48406
Mean of Response              12.01761
Observations (or Sum Wgts)    1422

Analysis of Variance (ANOVA)

Source      DF      Sum of Squares    Mean Square    F Ratio
Model       7       1297698.2         185385         1686.618
Error       1414    155420.5          110            Prob > F
C. Total    1421    1453118.6                        0.0000

Parameter Estimates

Term               Estimate     Std Error    t Ratio    Prob>|t|    VIF
Intercept          -1.82307     0.33588      -5.43      <.0001      -
Combat FH (K)      3.25529      0.03900      83.47      0.0000      1.21605
Training FH (K)    2.23308      0.07924      28.18      <.0001      22.60263
Total Sorties      -0.00238     0.00010      -22.96     <.0001      20.24559
C-130H             -18.39458    1.83450      -10.03     <.0001      1.04525
F-15C              14.56145     1.58155      9.21       <.0001      1.01293
T-1A               -91.29420    4.75890      -19.18     <.0001      1.84264
Bombers/Tankers    13.58475     0.96535      14.07      <.0001      1.13089

Table 2 presents the results of a potential model. Combat and training flying hours, total
sorties, C-130H, F-15C, T-1A, and the group Bombers/Tankers are the significant predictors. The
Bombers/Tankers group combines the B-1A, B-2B, B-52H, C-141B, C-17A, C-5A/B/C, and the
KC-10A. The adjusted R² of 0.89 indicates that the model predicts gallons well and is not overly
affected by the seven predictor variables selected. The F-test and t-tests all show p-values
below 0.05, indicating that the overall model and the individual predictor variables are
statistically significant. However, the training flying hours and total sorties predictors both
report VIF values well above five, suggesting multicollinearity (see Appendix C for the predictor
variables correlation matrix). Because the training flying hours explain more of the variation,
the total sorties predictor variable is eliminated from the model. Figure 8 shows the new least
squares estimated gallons in millions by the predicted gallons in millions without including
total sorties.
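
To make the VIF values easier to interpret, the sketch below converts between a VIF and the R² obtained when one predictor is regressed on all of the other predictors, using the standard identity VIF = 1 / (1 - R²). The function names are hypothetical and the snippet only restates the VIF values reported in Table 2; it does not reproduce the regression itself.

```python
def vif_from_r2(r2_j):
    """VIF for predictor j, given the R^2 of regressing predictor j on the others."""
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif):
    """Inverse of vif_from_r2: share of predictor j explained by the other predictors."""
    return 1.0 - 1.0 / vif


# VIF values reported in Table 2 for the two collinear predictors.
reported_vif = {"Training FH (K)": 22.60263, "Total Sorties": 20.24559}

for term, vif in reported_vif.items():
    # A VIF above 20 means the other predictors explain about 95 percent of this one.
    print(term, round(r2_from_vif(vif), 4))
```

Under this identity, the VIF threshold of 10 corresponds to 90 percent of a predictor's variation being explained by the remaining predictors, and a threshold of five corresponds to 80 percent.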


Figure 8: Final Least Squares Estimated Gallons by Predicted Gallons Plot

After numerous iterations, no other predictor variables or interactions improve the
adjusted R² of 0.853 while maintaining significant F and t-tests. Table 3 reports the new table of
statistics and shows that the VIF statistics for all six predictor variables are below five,
indicating that multicollinearity is no longer an issue. The p-value for the intercept's t-test is
greater than 0.05, indicating that the intercept is insignificant. Therefore the final model is as
follows:

$$\text{Gallons}_{\text{Millions}} = 3.18(\text{Combat Flying Hours}_{\text{Thousands}}) + 0.47(\text{Training Flying Hours}_{\text{Thousands}}) - 20.37(\text{C-130H}) + 12.33(\text{F-15C}) - 28.33(\text{T-1A}) + 18.51(\text{Bombers/Tankers})$$


Table 3: Final Single Macro-Level Table of Statistics

Single Macro-Level Summary of Fit

RSquare                       0.85316
RSquare Adj                   0.85254
Root Mean Square Error        12.28001
Mean of Response              12.01761
Observations (or Sum Wgts)    1422

Analysis of Variance (ANOVA)

Source      DF      Sum of Squares    Mean Square    F Ratio
Model       6       1239738.6         206623         1370.192
Error       1415    213380.0          151            Prob > F
C. Total    1421    1453118.6                        0.0000

Parameter Estimates

Term               Estimate     Std Error    t Ratio    Prob>|t|    VIF
Intercept          -0.37809     0.38645      -0.98      0.3280      -
Combat FH (K)      3.18419      0.04554      69.92      0.0000      1.2083829
Training FH (K)    0.46814      0.02261      20.7       <.0001      1.3414849
C-130H             -20.37252    2.14638      -9.49      <.0001      1.0429436
F-15C              12.33265     1.84898      6.67       <.0001      1.0091182
T-1A               -28.32668    4.55575      -6.22      <.0001      1.2308578
Bombers_Tankers    18.50742     1.10248      16.79      <.0001      1.0751304

The categorical predictors for the model are employed by inputting the counted number
of MAJCOMs represented by weapon system code (C-130H, F-15C, T-1A, and
Bombers/Tankers) that have programmed flying hours. The Bombers/Tankers categorical
variable is still employed even if one of the weapon systems (B-1A, B-2B, B-52H, C-141B,
C-17A, C-5A/B/C, and KC-10A) has no flying hour representation. For example, if the C-141B no
longer has programmed flying hours but the other five WSCs are all represented by three
MAJCOMs, then the input for the Bombers/Tankers variable is 15. Before claiming the model is
useful to forecast the Air Force aviation fuel requirement, several diagnostic tests are performed
to ensure the assumptions of multiple linear regression are met.
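
The following sketch shows one way the final single macro-level model could be evaluated. The coefficients are the parameter estimates from Table 3, with the insignificant intercept omitted as in the final model. The dictionary keys, the function name, and the input values in the example are hypothetical and are chosen only to illustrate the mechanics; they are not actual Air Force flying hour programs.

```python
# Parameter estimates from Table 3 (intercept omitted because it is insignificant).
FINAL_COEFFICIENTS = {
    "combat_fh_k": 3.18419,       # combat flying hours, in thousands
    "training_fh_k": 0.46814,     # training flying hours, in thousands
    "c130h": -20.37252,           # MAJCOM count for the C-130H
    "f15c": 12.33265,             # MAJCOM count for the F-15C
    "t1a": -28.32668,             # MAJCOM count for the T-1A
    "bombers_tankers": 18.50742,  # MAJCOM count summed over the Bombers/Tankers group
}


def predict_gallons_millions(inputs):
    """Predicted gallons (millions); predictors missing from `inputs` default to zero."""
    return sum(
        coefficient * inputs.get(name, 0.0)
        for name, coefficient in FINAL_COEFFICIENTS.items()
    )


# The text's example: five weapon systems, each represented by three MAJCOMs.
bombers_tankers_count = 5 * 3

# Hypothetical inputs, for illustration only.
example_inputs = {
    "combat_fh_k": 100.0,
    "training_fh_k": 50.0,
    "bombers_tankers": bombers_tankers_count,
}
print(round(predict_gallons_millions(example_inputs), 2))
```

Defaulting absent predictors to zero mirrors the way a categorical variable drops out of the equation when a weapon system has no programmed flying hours.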

Single Macro-Level Model Statistical Significance Tests

The regression model is sound if influential data points do not bias the selected
explanatory variables and the tests for normality, constant variance, and independence are
satisfied. Figure 9 displays the Cook's D influence statistic values on an overlay plot and
reveals three influential data points: the 2006 and 2007 C-17A in Air Mobility Command
(AMC) and the 2000 T-37B in the Air Education and Training Command (AETC). Although
statistically influential, there are no logical reasons to exclude the three data points.


Figure 9: Single Macro-Level Model Test for Influential Data Points

Figure 10 shows the gallons to flying hour relationship for the C-17A. The stars
represent the 2006 and 2007 data points. There is no indication that the two C-17A data points
are significantly different from the rest of the C-17A data with similar flying hours. Figure 11
shows that the C-17A from other MAJCOMs also follows the same trend line.


Figure 10: C-17A Gallons to Flying Hour Relationship for AMC


Figure 11: C-17A Gallons to Flying Hours Relationship across the Air Force

The removal of data requires a sound explanation even if the Cook's D test indicates
influential data. Figures 10 and 11 both illustrate that data entry errors are not a plausible
explanation. Figure 12 shows a nearly perfect relationship between gallons and flying hours for
the T-37B, indicating that the outliers are likely a result of flying more hours than the typical
weapon system in the data. Removing the data would decrease the ability to effectively forecast
the C-17A and T-37B fuel requirement.


Figure 12: T-37B Gallons to Flying Hours Relationship for AETC

The test for normality is determined by fitting a normal distribution to the residuals
from the regression model. Figure 13 displays the residual distribution and the fitted normal
function. The residuals are considered normally distributed when the p-value is greater than
0.05. The p-value of 0.00 indicates that the residuals are not normally distributed.


Goodness-of-Fit Test (Shapiro-Wilk W Test): W = 0.761697, Prob<W = 0.0000

Figure 13: Single Macro-Level Model Normality Test
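
The idea behind a residual normality test can be sketched without statistical software. The example below deliberately substitutes the Jarque-Bera statistic for the Shapiro-Wilk W test used in this research, because Jarque-Bera can be written in a few lines using only skewness and kurtosis; it is a different test and will not reproduce the W statistic above. The residual values are hypothetical and mimic a tightly massed distribution with a few extreme points.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, approximate p-value) for a list of residuals.

    JB = n/6 * (skewness^2 + (kurtosis - 3)^2 / 4). Under normality JB is
    approximately chi-square with 2 degrees of freedom, whose upper-tail
    probability is exp(-JB / 2). The approximation is rough for small samples.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


# Hypothetical residuals: most are small, two are extreme.
example_residuals = [-1.0, 1.0] * 25 + [100.0, -100.0]
jb_statistic, p_value = jarque_bera(example_residuals)
print(p_value < 0.05)  # a p-value below 0.05 rejects normality
```

As with the Shapiro-Wilk test, a p-value below 0.05 leads to rejecting the hypothesis that the residuals are normally distributed.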


The natural log transformation is often a solution to normality violations if a
curvilinear relationship exists. However, no curvilinear relationship is evident in the data, and
the transformation did not correct the failure of normality. Although the linear regression model
predicts a point estimate for gallons well, the point estimate variation inferences are based upon
the assumptions of normality and constant variance. Thus, the range and probabilities associated
with the variation of the model's prediction are not valid.

The tests for constant variance and independence are based upon a graphical inspection
of the model residuals by predicted gallons plot. The visual conclusion is that the
assumptions of constant variance and independence both fail. Figure 14 shows that the
residuals by predicted gallons are closely massed together, with a minority fanning out.
When constant variance and independence are present, the residuals are evenly distributed
around the zero line. The failure of all three assumptions indicates that linear regression is not
the appropriate model to predict aviation fuel requirements at the macro-level. Nevertheless, the
model's predictive capability is validated and the potential for sub macro-level models is
analyzed.

Figure 14: Single Macro-Level Model Test for Constant Variance

Single  Macro-Level  Aviation  Fuel  Regression  Model  Validation 

To validate the prediction capability of the single macro-level regression model, a
random sample of 20 percent of the data was excluded from the original model development.
The excluded data is fed into the regression model to compare the predictions to the actual
gallons consumed. However, this validation is fundamentally flawed because the underlying
assumptions of linear regression are not met. Thus, the 95 percent prediction and confidence
intervals are theoretically faulty.
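
A random holdout of this kind can be sketched as follows. The function name, the seed, and the use of row indices are assumptions made for illustration; the research does not state how the random sample was drawn. The total of 1,778 pooled data points is the figure given in the Chapter Summary.

```python
import random


def holdout_split(n_points, holdout_fraction=0.20, seed=0):
    """Randomly split row indices into (training, holdout) lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(n_points))
    rng.shuffle(indices)
    n_holdout = round(n_points * holdout_fraction)
    holdout = sorted(indices[:n_holdout])
    training = sorted(indices[n_holdout:])
    return training, holdout


training_idx, holdout_idx = holdout_split(1778)
print(len(training_idx), len(holdout_idx))
```

With 1,778 pooled data points, a 20 percent holdout yields 356 validation points and leaves 1,422 points for model development, matching the observation counts reported in this chapter.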

The difference between a prediction interval and a confidence interval is important to
understand. The confidence interval is the variation associated with the model's fitted
regression line. The prediction interval is the variation associated with any given predicted
point estimate. The predictability of any single point is far more uncertain than the fitted
regression line. For this reason, the prediction interval is always wider than the confidence
interval. Typically, validation tests use a prediction interval because the focus is on individual
point estimates.
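
The relationship between the two intervals can be illustrated for a simple linear regression with one predictor. The sketch below computes the standard error behind each interval using the standard textbook formulas; the function name and the data values are hypothetical, and the multiplier that turns a standard error into a 95 percent interval (a t critical value) is left out to keep the example self-contained.

```python
import math


def interval_standard_errors(x, y, x0):
    """Return (fitted value, SE for the confidence interval, SE for the prediction interval) at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    core = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = math.sqrt(mse * core)          # uncertainty of the fitted line
    se_pred = math.sqrt(mse * (1.0 + core))  # adds the scatter of a single new point
    return intercept + slope * x0, se_mean, se_pred


# Hypothetical flying hours (thousands) and gallons (millions).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
gallons = [2.9, 5.4, 8.6, 10.9, 13.7]
fitted, se_mean, se_pred = interval_standard_errors(hours, gallons, x0=3.0)
print(se_pred > se_mean)
```

The extra term inside the prediction standard error is the residual variance itself, which is why the prediction interval can never be narrower than the confidence interval.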

The results are impressive but deceiving. Of the 356 point estimates, 100 percent fell within a
95 percent confidence interval and well within the 95 percent prediction interval. At the
macro-level, the model predicted a requirement of 4,428.27 million gallons versus the actual
consumption of 4,486.18 million gallons, a difference of 57.91 million gallons, or only 1.29
percent. The lower and upper confidence limits are 3,998.79 and 4,857.75 million gallons
respectively. The range of uncertainty defined by the 95 percent confidence interval (three
standard deviations) is 858.96 million gallons, or plus or minus 9.7 percent of the prediction.

The 95 percent prediction interval reveals the evidence of extreme or influential data
points, with lower and upper bounds of -4,166.49 and 13,023.03 million gallons
respectively. This equates to a range of 17,189 million gallons, or plus or minus 194.1
percent of the prediction. The range of uncertainty is unrealistic and meaningless, resulting in a
lack of confidence in the point estimate. The lower bound reveals the unrealistic nature of the
model by reporting a negative requirement for aviation fuel. Although the results are impressive,
the single macro-level model application is discouraged without caution or understanding of the
underlying statistics. For this reason, the research investigates potential sub macro-level models.
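
The validation figures quoted above follow from simple arithmetic on the actual consumption, the prediction, and the interval bounds. The sketch below reproduces that arithmetic; the function name is hypothetical, and the inputs are the single macro-level values stated in the text, in millions of gallons.

```python
def validation_summary(actual, predicted, lower, upper):
    """Return (difference, percent difference, interval range, plus-or-minus percent)."""
    difference = actual - predicted
    pct_difference = 100.0 * difference / actual
    interval_range = upper - lower
    # Half of the range, expressed as a percentage of the prediction.
    pct_half_width = 100.0 * (interval_range / 2.0) / predicted
    return difference, pct_difference, interval_range, pct_half_width


# Single macro-level model, millions of gallons.
ci = validation_summary(4486.18, 4428.27, 3998.79, 4857.75)
pi = validation_summary(4486.18, 4428.27, -4166.49, 13023.03)

print([round(value, 2) for value in ci])
print([round(value, 2) for value in pi])
```

The stated confidence limits are symmetric about the prediction and imply a range of 858.96 million gallons, or plus or minus 9.7 percent, while the prediction limits imply a range of about 17,189.5 million gallons, or plus or minus 194.1 percent.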

Sub  Macro-Level  Aviation  Fuel  Regression  Models 

The sub macro-level research attempts to group the data into like pools to alleviate the
impact of influential data points and to satisfy the assumptions of linear regression. Figure 15
displays the same scatter plot shown in Figure 6, but identifies like sub-pooled groups of data.
The same methodology is applied to each sub macro-level model, and each is tested against the
same underlying assumptions of linear regression.

Figure 15: Scatter Plot Sub Macro-Level Pooled Data


The Bomber and Tanker pooled group (B-1A, B-2B, B-52H, C-141B, C-17A, C-5A/B/C,
and KC-10A) is the first set of data analyzed for a linear relationship. Figure 16 displays the
relationship between gallons consumed and flying hours, showing an initial concern with outliers,
the massing of data at the lower gallons consumed, and the variation increasing as more hours
are flown. The same concerns are prevalent in the single macro-level model. However, the tests
for a linear relationship and the theoretical assumptions are conducted to determine if the
sub-level model is statistically significant.

Figure 16: Scatter Plot of the Bomber and Tanker Group


Figure 17 reports the least squares estimated gallons by the predicted gallons, showing a
strong relationship between gallons and flying hours with an adjusted R² of 0.986. Flying hours
is the only significant explanatory variable when predicting gallons consumed, explaining
roughly 99 percent of the variation. The final Bomber and Tanker model is displayed as:

$$\text{Gallons}_{\text{Millions}} = 2.54 + 2.75(\text{Flying Hours}_{\text{Thousands}})$$

Figure 17: Bomber/Tanker Least Squares Estimated Gallons by Predicted Gallons Plot

Table 4 displays the table of statistics, with an acceptable F-test (p-value less than
0.05) and a VIF less than five. Although the linear relationship is extremely strong, the
theoretical assumptions of linear regression are tested for model soundness. The tests for
influential data points, constant variance, independence, and normality follow.


Table 4: Bomber/Tanker Table of Statistics

Single Macro-Level Summary of Fit

RSquare                       0.98567
RSquare Adj                   0.98557
Root Mean Square Error        9.61852
Mean of Response              46.02283
Observations (or Sum Wgts)    149

Analysis of Variance (ANOVA)

Source      DF     Sum of Squares    Mean Square    F Ratio
Model       1      935233.8          935234         10108.9
Error       147    13599.8           93             Prob > F
C. Total    148    948833.7                         <0.0001

Parameter Estimates

Term                Estimate     Std Error    t Ratio    Prob>|t|    VIF
Intercept           2.5394719    0.898863     2.83       0.0054      -
Flying Hours (K)    2.7548227    0.027399     100.54     <0.0001     1.0000

Sub Macro-Level Model Statistical Significance Tests


Figure 18 plots the Cook's D influence statistic values on an overlay plot and reveals that
the 2002 and 2003 C-17A in AMC are the two influential data points. Although statistically
influential, there are no logical reasons to exclude the two data points. The C-17A data is
influential because of the higher flying hours and gallons consumed compared to the rest of the
data set.
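
For a regression with a single predictor, such as the Bomber and Tanker model, Cook's D can be computed directly from the residuals and the leverage of each point. The sketch below uses the standard formula; the function name and the data are hypothetical, with the last point given far more flying hours than the rest to mimic the position of the C-17A observations.

```python
def cooks_distance(x, y):
    """Cook's D for each observation in a simple linear regression of y on x."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        distances.append(e ** 2 / (p * mse) * leverage / (1.0 - leverage) ** 2)
    return distances


# Hypothetical flying hours (thousands) and gallons (millions).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
gallons = [2.8, 5.5, 8.3, 11.0, 13.8, 40.0]
distances = cooks_distance(hours, gallons)
most_influential = distances.index(max(distances))
print(most_influential)
```

A point far from the mean flying hours has high leverage, so even a modest residual produces a large Cook's D; this is the mechanism by which the high flying hour C-17A points are flagged.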


When the influential data points are removed, Table 5 shows an insignificant change in the
linear relationship and the parameter estimates. A significant improvement in the adjusted R²
would indicate that the C-17A data is causing the “lone ranger effect,” that is, regressing two
significantly separate clusters of data. Essentially, regressing two different clusters of data is
like regressing two data points. However, the adjusted R² drops 2.3 percent, from 0.986 to 0.963,
indicating that the C-17A data points slightly improve the linear relationship.
The flying hours parameter estimate only drops from 2.755 to 2.749 when the two C-17A data
points are removed. The parameter estimates are virtually the same, indicating that the C-17A
data falls on the same regression line as the majority of the data. Thus, the C-17A data should
remain in the dataset.

Table 5: Bomber/Tanker Table of Statistics Excluding Influential Data Points

Single Macro-Level Summary of Fit

RSquare                       0.96375
RSquare Adj                   0.96345
Root Mean Square Error        9.23697
Mean of Response              34.06246
Observations (or Sum Wgts)    144

Parameter Estimates

Term                Estimate     Std Error    t Ratio    Prob>|t|    VIF
Intercept           2.5639255    0.924818     2.77       0.0063      -
Flying Hours (K)    2.7492339    0.044742     61.45      <0.0001     1.0000

The test for normality is determined by fitting a normal distribution to the residuals
from the Bomber and Tanker regression model. The residuals are considered normally distributed
when the p-value is greater than 0.05. The p-value of 0.00 indicates that the residuals are not
normally distributed, as displayed in Figure 19.


Figure 19: Bomber/Tanker Macro-Level Model Normality Test

Based upon a graphical inspection of the Bomber and Tanker model residuals by
predicted gallons plot in Figure 20, the visual conclusion is that the assumptions of constant
variance and independence both fail. The residuals should follow an even distribution about the
zero line across both the x and y axes. However, Figure 20 clearly shows that the majority of the
data is gathered at the lower end of the gallons predicted scale and then fans out abruptly. Like
the single macro-level model, the sub macro-level model does not solve the linear regression
assumption failures of normality, constant variance, and independence. Therefore, the variation
inferences derived from linear regression are faulty. However, the linear regression model is
highly accurate at predicting gallons consumed, as shown in the Bomber and Tanker model
validation.


Figure 20: Bomber/Tanker Model Test for Constant Variance

Sub  Macro-Level  Aviation  Fuel  Regression  Model  Validations 

The same single macro-level model methodology is employed to validate the prediction
capability of the Bomber and Tanker sub macro-level regression model. As with the single
macro-level model, the validation is fundamentally flawed because the underlying assumptions of
linear regression are not met. However, the results are provided to show the predictive capability
of the Bomber and Tanker model.

Again, the results are impressive but deceiving. Of the 42 point estimates, 100 percent fell
within a 95 percent confidence interval and well within the 95 percent prediction interval. At the
macro-level, the model predicted a requirement of 1,732.03 million gallons versus the actual
consumption of 1,817.47 million gallons, a difference of 85.44 million gallons, or only 4.70
percent. The lower and upper confidence limits are 1,655.14 and 1,808.91 million gallons
respectively. The range of uncertainty defined by the 95 percent confidence interval (three
standard deviations) is 153.77 million gallons, or plus or minus 4.4 percent of the prediction.

The 95 percent prediction interval reveals the evidence of extreme or influential data
points, with lower and upper bounds of 929.77 and 2,534.29 million gallons respectively. This
equates to a range of 1,604.51 million gallons, or plus or minus 46.3 percent of the prediction.
The range of uncertainty is unrealistic and meaningless, indicating a lack of confidence in point
estimate predictions. Although the linear relationship is impressive, the Bomber and Tanker sub
macro-level model application is discouraged without caution or understanding of the underlying
statistics.

Table 6 summarizes the six sub macro-level models and the single
macro-level model. The remaining five sub macro-level models have strong linear relationships
but fail the theoretical assumptions of linear regression. Appendix B displays the graphical
linear regression assumption test results of all six sub macro-level models. The single
macro-level model predicted the actual aviation fuel gallons consumed better than the summation
of the six sub macro-level models. However, the variation of the single macro-level model is over
four times wider than that of the sub macro-level models. Both models are potentially useful to
provide secondary estimates of aviation fuel requirements.


Table 6: Summary of Linear Regression Models

Columns: Model 1, Model 2, Model 3, Model 4, Model 5, Model 6, Macro

Rows:

Description of Data
Number of Observations Withheld
Actual Aviation Fuel Consumed
Predicted Aviation Fuel Consumed
Actual - Predicted Gallons
Percent Difference of Actual and Predicted
Lower Bound of 95% Confidence Interval (CI)
Upper Bound of 95% CI
95% CI Range
95% CI +/- 3σ
Lower Bound of 95% Prediction Interval (PI)
Upper Bound of 95% PI
95% PI Range
95% PI +/- 3σ

Regression Statistics

Adjusted R²
Variance Inflation Factor (VIF)
F-test
t-test
Cook's D Test for Influential Data Points
Test for Normality
Constant Variance/Independence
Number of Observations


Weapon System Code (WSC) Breakout for each Sub Macro-Level Model:

1. Bomber and Tanker Sub Macro-Level Model (B-1A, B-2B, B-52H, C-141B/C, C-17A, C-5A/B/C, KC-10A)

2. F-16 and C-130 Sub Macro-Level Model (AC-130H/U, C-130E/H/J, EC-130E/H, F-117A, F-16A/B/C/D, HC-130N/P, LC-130H,
MC-130E/H/P, NC-130H, TC-130H, WC-130H)

3. F-15 and C-135 Sub Macro-Level Model (C-12C/F/J, C-135B/C/[illegible], C-20A/B, C-22B, C-26A/B, C-32A/B, C-37A, C-38A, C-40B/C,
C-9A/C, E-3B/C, E-4B, E-8C, EC-135K/[illegible], F-15A/B/C/D/E, F-22A, KC-135D/E/R/T, NKC-135E, OC-135B, RC-135U/V/W, TC-135S,
TE-8A, VC-25A, WC-135C)

4. T-6 Sub Macro-Level Model (T-6A)

5. Helo and UAS Sub Macro-Level Model (C-21A, HH-60G, MH-53M, MQ-9A, RQ-9A, T-37B, U-2S, UH-1N/V)

6. A-10 Sub Macro-Level Model (A-10A, OA-10A, AT-38B, F-4F, T-38A/C, T-39B)

Other Notes:

7. Influential data point causes only slight change in linear regression model's slope and intercept

8. Influential data point causes only slight change in linear regression model's slope and intercept

9. Influential data point causes only slight change in linear regression model's slope and intercept

10. Influential data produces "lone ranger" effect, potentially changing the linear regression model slope and intercept

11. Influential data point causes only slight change in linear regression model's slope and intercept

12. Influential data point causes only slight change in linear regression model's slope and intercept

The Six Sub Macro-Level Linear Regression Models (Use WSC breakout flying hour totals)

1. $\text{Gallons}_{\text{Millions}} = 2.5394719 + 2.7548227\,(\text{Flying Hours}_{\text{Thousands}})$

2. $\text{Gallons}_{\text{Millions}} = -0.161457 + 0.8335911\,(\text{Flying Hours}_{\text{Thousands}})$

3. $\text{Gallons}_{\text{Millions}} = -0.29016 + 1.7213995\,(\text{Flying Hours}_{\text{Thousands}})$

4. $\text{Gallons}_{\text{Millions}} = -0.002574 + 0.0652076\,(\text{Flying Hours}_{\text{Thousands}})$

5. $\text{Gallons}_{\text{Millions}} = -0.077913 + 0.1846358\,(\text{Flying Hours}_{\text{Thousands}})$

6. $\text{Gallons}_{\text{Millions}} = 0.9833719 + 0.419244\,(\text{Flying Hours}_{\text{Thousands}})$
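
The six sub macro-level equations listed with Table 6 can be collected into a small lookup so that a total requirement is obtained by summing the sub-model predictions. The sketch below is illustrative only: the model names, function names, and example flying hours are hypothetical, and the intercept of the sixth model is read as 0.9833719 from a partly illegible source value.

```python
# (intercept, slope) pairs; gallons in millions, flying hours in thousands.
SUB_MODELS = {
    "Bomber and Tanker": (2.5394719, 2.7548227),
    "F-16 and C-130": (-0.161457, 0.8335911),
    "F-15 and C-135": (-0.29016, 1.7213995),
    "T-6": (-0.002574, 0.0652076),
    "Helo and UAS": (-0.077913, 0.1846358),
    "A-10": (0.9833719, 0.419244),  # intercept assumed; the source value is partly illegible
}


def predict_sub_model(name, flying_hours_k):
    """Predicted gallons (millions) for one sub-model and one flying hour total."""
    intercept, slope = SUB_MODELS[name]
    return intercept + slope * flying_hours_k


def predict_total(flying_hours_k_by_model):
    """Sum the sub-model predictions over the supplied flying hour totals."""
    return sum(
        predict_sub_model(name, hours)
        for name, hours in flying_hours_k_by_model.items()
    )


# Hypothetical flying hour totals in thousands, for illustration only.
example_hours = {"Bomber and Tanker": 10.0, "T-6": 100.0}
print(round(predict_total(example_hours), 4))
```

Each equation is applied to the flying hour total of the weapon system codes in its breakout, so the choice of how observations are grouped before prediction affects how many times each intercept enters the sum.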

The failure of the theoretical assumptions indicates that linear regression is not the ideal
method to forecast aviation fuel requirements. Although the sub and single macro-level models
predict the actual gallons consumed well, the forecasted point estimate is highly uncertain due to
the failure of fundamental linear regression assumptions. Further analysis was conducted to
separate the data into categories according to the number of flying hours. This produced
favorable results for weapon systems that flew over 10 million hours, producing a statistically
significant model that met linear regression assumptions. However, the model proved
incomplete, as the lower flying hour data points did not yield statistically significant models or
pass the linear regression assumptions. Thus, pooling time series data to employ multiple linear
regression is not the preferred model building method to forecast aviation fuel requirements.

Chapter  Summary 

This chapter provides the results of applying the methodology to 1,778 pooled data points
of gallons consumed by weapon systems. The methodology is applied first to a single
macro-level model, which perfectly predicts gallons within a 95 percent confidence interval.
However, the underlying assumptions of the single macro-level linear regression model are faulty.
For this reason, the data is divided into sub-pooled sets and the methodology is reapplied. The sub
macro-level models also perfectly predict gallons within a 95 percent confidence interval.
Unfortunately, the sub macro-level models suffer from the same shortcomings, failing the
tests of normality, constant variance, and independence. Although both models prove valid as
capable prediction models based upon 95 percent prediction and confidence intervals, the
statistics that provide the basis for the variation calculations are not founded upon sound linear
regression assumptions. Thus, the application of the single and sub macro-level linear
relationship models is discouraged without caution or understanding of the
underlying statistics. Chapter Five summarizes and concludes this research effort.

V. Conclusion


Chapter  Overview 

This chapter summarizes the findings and conclusions of developing a model to forecast
aviation fuel requirements using pooled time series analysis. Chapters One, Two, and Three are
summarized and a summary of the results of Chapter Four is presented. Finally, the limitations
of this research are presented and recommendations are provided for future research efforts.

Research Summary

Chapter One introduces the AFCAA search for a macro-level model that will forecast the
United States Air Force (USAF) aviation fuel requirement. The AFCAA would like a
macro-level model that will provide a cross-check to current aviation fuel estimates or potentially
replace the current technique employed. In the wake of volatile aviation fuel prices and the
USAF's dependency on foreign oil, a macro-level model will provide the AFCAA with a
forecasting model to conduct alternative fuel source comparisons. The chapter concludes with
the AFCAA desire for this research to determine if pooled time series analysis can develop a
macro-level model to forecast the baseline Air Force aviation fuel requirement for alternative
fuel source comparison studies.

Chapter Two describes the United States' and the USAF's dependency on foreign oil and the
vulnerabilities associated with competing for a finite global resource. Potential alternative fuel
sources are briefly reviewed. The chapter then focuses on existing methods or
techniques to forecast aviation fuel requirements. Finally, the Coast Guard's pooled time series
analysis ship and aviation forecasting model is examined for applicability to forecasting Air
Force aviation fuel requirements.

Chapter Three describes the pooled time series analysis methodology used by the Coast
Guard to develop mathematical relationships to forecast aviation and ship fuel requirements.
An explanation of the data preparation and pooling technique is provided. The chapter
introduces the potential explanatory variables used in the regression analysis, and describes the
theoretical tests necessary to claim a statistically significant model. The chapter continues by
explaining the methodology used to validate the predictive capability of a theoretically sound
model. Finally, the method to assess the risk and uncertainty of model predictions is explained.

Chapter Four presents the results of the pooled time series analysis methodology applied
to 1,778 pooled data points of gallons consumed by weapon systems. Two models are
developed and explained. The first is the single macro-level model, which predicts gallons
consumed within a 95 percent confidence interval for 100 percent of the validation data points.
The second is the sub macro-level model, which achieves the same result. The chapter explains that
both models predicted actual historical consumption well; however, both models failed the
underlying assumptions of linear regression. For this reason, application of the single and sub
macro-level linear models is discouraged without caution and an understanding of the
underlying statistics.
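
The "100 percent within a 95 percent interval" result is a coverage rate: the share of held-out actual values that fall inside the interval computed for each prediction. The sketch below shows that calculation under stated assumptions. The function name `interval_coverage` and all numbers are hypothetical, and the interval bounds are taken as given rather than derived from a fitted model.

```python
def interval_coverage(actuals, lowers, uppers):
    """Return the fraction of actual values inside [lower, upper]."""
    if not (len(actuals) == len(lowers) == len(uppers)):
        raise ValueError("inputs must have the same length")
    if not actuals:
        raise ValueError("at least one observation is required")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actuals, lowers, uppers))
    return hits / len(actuals)


# Hypothetical gallons (in thousands) for four held-out observations.
actual = [105.0, 98.0, 120.0, 87.0]
lower = [90.0, 90.0, 100.0, 90.0]
upper = [110.0, 110.0, 125.0, 110.0]
coverage = interval_coverage(actual, lower, upper)
print(coverage)  # the last observation falls below its lower bound
```

A coverage rate well above the nominal level can signal intervals that are too wide, which is one reason a high rate alone does not establish a sound model.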

Model  Application  and  Limitations 

The application of either the single or sub macro-level model to forecast aviation fuel
requirements is discouraged. Both models suffer from the same statistical limitation: the
underlying assumptions of linear regression are violated. However, applying the models as a
cross-check against current aviation fuel requirement estimates will provide another layer of
validation of both models’ predictive capability.

Future Research


The research indicates that multiple linear regression, even when pooling the data, is not
the best method to develop a macro-level aviation fuel requirements model. Further analysis to
find better explanatory variables may produce favorable results using linear regression. As
stated in Chapter Two, the AFCAA’s 1998 fuel consumption cost estimating relationship
study reveals other possible explanatory variables that may better predict aviation fuel
consumption when employing pooled time series analysis. Other methodologies, such as
maximum likelihood regression or nonlinear estimation techniques, are also worth investigating.

Conclusions 

This research developed a macro-level model that predicted aviation fuel requirements
well using a pooled time series analysis methodology. The validation tests appear impressive:
the model predicted historical gallons consumed 100 percent of the time within a 95 percent
confidence and prediction interval. However, the results are deceiving for two major
reasons.

First, although the macro-level regression model predicts historical aviation fuel
requirements well, the theoretical assumptions for linear regression fail. The underlying linear
regression assumptions of normality, constant variance, and independence are the foundation of a
statistically significant linear regression model. The macro-level model fails all three
assumptions, thus bringing into question the accuracy of individual predictions for future gallon
consumption requirements. Although the validation tests report that the models predict historical
gallons consumed 100 percent of the time within a 95 percent confidence and prediction interval,
both intervals rest on the same assumptions that the models fail. Thus, the
validation results are fundamentally faulty. The research investigated the potential of breaking
the macro-level pooled data into sub macro-levels to correct for the linear regression assumption
failures. However, the sub macro-level models revealed the same theoretical failures, indicating
that linear regression is not the ideal methodology to develop a model to predict aviation fuel
requirements.
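
To make two of the three assumptions concrete, the sketch below computes one common diagnostic for independence (the Durbin-Watson statistic) and one for normality (the Jarque-Bera statistic) from a list of residuals. These two statistics are chosen here for illustration only; they are not necessarily the tests applied in this research, and constant variance is not covered. The function names and the sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 suggests no first-order
    autocorrelation; values toward 0 or 4 suggest positive or
    negative autocorrelation respectively."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def jarque_bera(residuals):
    """Jarque-Bera statistic built from sample skewness and kurtosis.
    Larger values indicate a stronger departure from normality."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Hypothetical residuals that alternate in sign (negative autocorrelation).
example = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(example), jarque_bera(example))
```

In practice each statistic would be compared against a critical value or p-value from a statistics package before concluding that an assumption fails.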

Second, the confidence and prediction intervals provide the basis to conduct uncertainty
and risk analysis. However, the standard deviation or variance statistics that determine the
intervals are only valid if the fundamental assumptions of linear regression are satisfied. Thus,
using the mean and standard deviation parameters for a Monte Carlo simulation to determine the
quantitative uncertainty or risk in the models’ predictions is fundamentally flawed. Therefore, the
research concludes that the application of either the single or sub macro-level model is
discouraged without proper understanding of the underlying statistics provided.
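
The Monte Carlo approach described above can be sketched as repeated draws from a normal distribution defined by a predicted mean and a standard deviation. The sketch below is illustrative only: the function names, the mean of 1,000,000 gallons, the standard deviation of 50,000 gallons, the draw count, and the seed are all hypothetical. It also makes the dependency visible, since every draw assumes normally distributed errors with a known, constant spread.

```python
import random
import statistics


def simulate_gallons(mean_gallons, sd_gallons, n_draws=10_000, seed=1):
    """Draw simulated fuel requirements from a normal distribution.

    Relies on the normality and constant-variance assumptions; if those
    fail, the simulated spread is not trustworthy.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean_gallons, sd_gallons) for _ in range(n_draws)]


def empirical_percentile(values, fraction):
    """Return the value at the given fraction of the sorted sample."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(fraction * len(ordered)))
    return ordered[index]


# Hypothetical point prediction and standard deviation, in gallons.
draws = simulate_gallons(mean_gallons=1_000_000.0, sd_gallons=50_000.0)
p05 = empirical_percentile(draws, 0.05)
p95 = empirical_percentile(draws, 0.95)
print(round(statistics.fmean(draws)), round(p05), round(p95))
```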

Appendix A: Categorical Predictors


Table  7:  Detailed  List  of  Categorical  Predictors 


Weapon System Codes (WSC)

 1  A-10A     20  C-20A     39  E-4B      58  HC-130P   77  RC-135W
 2  AC-130H   21  C-20B     40  E-8C      59  HH-60G    78  RQ-4A
 3  AC-130U   22  C-21A     41  EC-130E   60  KC-10A    79  T-37B
 4  AT-38B    23  C-22B     42  EC-130H   61  KC-135D   80  T-38A
 5  B-1B      24  C-26A     43  EC-135K   62  KC-135E   81  T-38C
 6  B-2A      25  C-26B     44  EC-135N   63  KC-135R   82  T-39B
 7  B-52H     26  C-32A     45  F-117A    64  KC-135T   83  T-6A
 8  C-12C     27  C-32B     46  F-15A     65  LC-130H   84  TC-130H
 9  C-12F     28  C-37A     47  F-15B     66  MC-130E   85  TC-135S
10  C-12J     29  C-38A     48  F-15C     67  MC-130H   86  TE-8A
11  C-130E    30  C-40B     49  F-15D     68  MC-130P   87  U-2S
12  C-130H    31  C-40C     50  F-15E     69  MH-53M    88  UH-1N
13  C-130J    32  C-5A      51  F-16A     70  MQ-9A     89  UH-1V
14  C-135B    33  C-5B      52  F-16B     71  NC-130H   90  VC-25A
15  C-135C    34  C-5C      53  F-16C     72  NKC-135E  91  WC-130H
16  C-135E    35  C-9A      54  F-16D     73  OA-10A    92  WC-135C
17  C-141B    36  C-9C      55  F-22A     74  OC-135B
18  C-141C    37  E-3B      56  F-4F      75  RC-135U
19  C-17A     38  E-3C      57  HC-130N   76  RC-135V

Major Commands (MAJCOMs)                              Mission Type

 1  Air Combat Command (ACC)                           1  Bombers
 2  Air Education and Training Command (AETC)          2  Fighters
 3  Air Force Materiel Command (AFMC)                  3  Command and Control
 4  Air Force Reserve Command (AFRC)                   4  Combat Search & Rescue
 5  Air Force Space Command (AFSPC)                    5  Electronic Warfare
 6  Air Force Special Operations Command (AFSOC)       6  ISR
 7  Air Mobility Command (AMC)                         7  Special Operations
 8  Air National Guard (ANG)                           8  Strategic Lift
 9  Pacific Air Force (PACAF)                          9  Tactical Lift
10  United States Air Force in Europe (USAFE)         10  Tanker
                                                      11  Trainer

Weapon System Type

 1  Bombers
 2  Fighters
 3  Helicopters
 4  Reconnaissance
 5  Special Duty
 6  Trainers
 7  Tanker Transport
 8  Unmanned Aerial Systems
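
Categorical predictors such as the codes in Table 7 are commonly entered into a regression as 0/1 indicator (dummy) variables rather than as raw code numbers, since the codes carry no numeric meaning. The sketch below shows one conventional encoding that drops a baseline level to avoid perfect collinearity with the intercept. The function name `dummy_code` and the choice of baseline are assumptions for illustration; the encoding actually used in this research is not stated in this appendix.

```python
def dummy_code(value, levels, baseline):
    """Return 0/1 indicators for every level except the baseline.

    The baseline level is represented by all zeros.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0
            for level in levels if level != baseline]


# Mission Type codes 1 through 11 as listed in Table 7.
MISSION_TYPES = list(range(1, 12))

# Encode code 10 (Tanker) with code 1 (Bombers) as a hypothetical baseline.
row = dummy_code(10, MISSION_TYPES, baseline=1)
print(row)
```

With eleven mission types this yields ten indicator columns per observation, one for each non-baseline level.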

Appendix B: Statistical Tests for the Sub Macro-Level Models

Figure 22: Sub Macro-Level Models Test for Influential Data Points

Figure 23: Sub Macro-Level Models Test for Normality

Appendix C: Correlation Matrix for Macro-Level Model


Table 8: Correlation Matrix for Predictor Variables


Correlation Matrix   Combat FH(K)   Training FH(K)   Total Sorties   C-130H   F-15C     T-1A      Bombers/
Combat FH(K)         1.000          0.278            0.296           0.171    (0.018)   (0.014)

0.079

0.044

0.006

Figure 25: Macro-Level Model Correlation Matrix Scatter Plot
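
Each entry in a correlation matrix such as Table 8 is a pairwise Pearson correlation coefficient between two predictor columns. The sketch below shows how such a matrix can be computed. The function names and the three short data columns are hypothetical and do not reproduce the values in Table 8.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("columns must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical predictor columns (flying hours in thousands, sortie counts).
data = {
    "Combat FH(K)": [1.0, 2.0, 3.0, 4.0],
    "Training FH(K)": [2.0, 1.0, 4.0, 3.0],
    "Total Sorties": [10.0, 20.0, 30.0, 40.0],
}
matrix = correlation_matrix(data)
print(round(matrix["Combat FH(K)"]["Training FH(K)"], 3))
```

High off-diagonal values between predictors would point to multicollinearity, which is the usual reason for inspecting such a matrix before fitting.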




Notes 


1. CIA World Fact Book.
2. Ibid.
3. Petersen, The Road to 2015, 146.
4. EIA, 2008.
5. Myers, Ultimate Security, 177.
6. Roberts, The End of Oil, 57.
7. EIA, 2008.
8. Ibid.
9. Roberts, The End of Oil, 51.
10. Ibid, 52.
11. Ibid, 52.
12. Ibid, 59.
13. CIA World Fact Book.
14. US Census Bureau.
15. Air Force Cost Analysis Agency, Thomas Ties.
16. CIA World Fact Book.
17. Danigole, 1.
18. Ibid, 9.
19. Ibid, 4.
20. Ibid, 4.
21. Ibid, 6.
22. Ibid, 25-29.
23. AFCAA, Fuel Consumption Cost Estimating Relationship, 2-4.
24. Schwartz, Forecasting Fuel Consumption, 1-1.
25. Ibid, 2-3.
26. Ibid, A-1.
27. Ibid, A-1.
28. Ibid, 2-3.
29. Ibid, A-2, A-3.
30. Sall, JMP Start Statistics Software.
31. Ibid.
32. Brown, Forecasting Research & Development Program Budgets, 36.
33. Ibid, 44.
34. Sall, JMP Start Statistics Software.
35. Neter, Applied Linear Statistical Models, 380.
36. Sall, JMP Start Statistics Software.
37. Neter, Applied Linear Statistical Models, 115.
38. Brown, Forecasting Research & Development Program Budgets, 45.
39. Ibid, 45.
40. Ibid, 46.
41. Sall, JMP Start Statistics Software.
42. Brown, Forecasting Research & Development Program Budgets, 51.
43. Sall, JMP Start Statistics Software.
44. Ibid.
45. Ibid.
46. Neter, Applied Linear Statistical Models, 385.
47. Ibid, 387.
48. Sall, JMP Start Statistics Software.
49. Ibid.
50. Ibid.
51. Ibid.
52. Ibid.
53. Ibid.
54. Ibid.
55. Ibid.
56. Ibid.
57. Ibid.
58. Ibid.
59. Ibid.
60. Ibid.
61. Ibid.
62. Ibid.
63. Ibid.
64. Ibid.
65. Ibid.
66. Ibid.
67. Ibid.
68. Ibid.
69. Ibid.
70. Ibid.



Bibliography 


Air Force Cost Analysis Agency. Fuel Consumption Cost Estimating Relationship. January 2000.

Brown, Thomas W. Forecasting Research & Development Program Budgets Using the Weibull
Model. MS Thesis, Air Force Institute of Technology: Wright-Patterson AFB OH, 2002.
(AD-A400571).

Central Intelligence Agency. The World Factbook. November 2008.
https://www.cia.gov/library/publications/the-world-factbook/ (accessed November 2008).

Danigole, Mark S. "BIOFUELS: An Alternative to US Air Force Petroleum Fuel Dependency."
Maxwell Air Force Base: Air University, December 2007.

Energy Information Administration. November 2008.
http://tonto.eia.doe.gov/dnav/pet/pet_move_impcus_a2_nus_ep00_im0_mbbl_m.htm
(accessed November 2008).

Myers, Norman. Ultimate Security: The Environmental Basis of Political Stability. Washington,
DC: Island Press, 1996.

Neter, John, Michael H. Kutner, Christopher J. Nachtsheim, and William Wasserman. Applied
Linear Statistical Models (Fourth Edition). Boston: McGraw-Hill Companies, Inc., 1996.

Petersen, John L. The Road to 2015: Profiles of the Future. Corte Madera: Waite Group Press,
1994.

Roberts, Paul. The End of Oil: On the Edge of a Perilous New World. New York: Houghton
Mifflin Co, 2005.

Sall, John, Ann Lehman, and Lee Creighton. "JMP Start Statistics: A Guide to Statistics and
Data Analysis Using JMP® and JMP IN® Software." SAS Institute Inc., 2001.

Schwartz, Lawrence, Isabela Castaneda, Ronald L. Straight, and Avery Williams. "Forecasting Fuel
Consumption: USCG Aircraft and Cutters." McLean: Logistics Management Institute,
January 1999.

United States Census Bureau. International Data Base (IDB). November 2008.
http://www.census.gov/ipc/www/idb/ (accessed November 2008).




